Improved Dyna-Q: A Reinforcement Learning Method Focused via Heuristic Graph for AGV Path Planning in Dynamic Environments

Abstract

Dyna-Q is a reinforcement learning method widely used in AGV path planning. However, in large complex dynamic environments, due to the sparse reward function of Dyna-Q and the large searching space, this method has problems of low search efficiency, slow convergence speed, and even inability to converge, which seriously reduces its performance and practicability. To solve these problems, this paper proposes an Improved Dyna-Q algorithm for AGV path planning in large complex dynamic environments. First, to address the problem of the large search space, the paper proposes a global path guidance mechanism based on a heuristic graph, which can effectively reduce the path search space and, thus, improve the efficiency of obtaining the optimal path. Second, to address the problem of the sparse reward function in Dyna-Q, the paper proposes a novel reward function and action selection method based on the heuristic graph, which can provide more intensive feedback and more efficient action decisions for AGV path planning, improving the convergence of the algorithm. We evaluated our approach in scenarios with static obstacles and dynamic obstacles. The experimental results show that the proposed algorithm can obtain better paths more efficiently than other reinforcement-learning-based methods, including the classical Q-Learning and Dyna-Q algorithms.
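For orientation, the classical tabular Dyna-Q baseline that the abstract builds on can be sketched as follows. This is a minimal illustration only, not the paper's improved algorithm: it contains no heuristic graph, no modified reward function, and no dynamic obstacles. The 4x4 grid, the obstacle layout, the constant step cost of -1, and all hyperparameters and function names are assumptions chosen for the example.

```python
import random

# Hypothetical 4x4 grid world; layout, rewards and hyperparameters are illustrative only.
SIZE = 4
START, GOAL = (0, 0), (3, 3)
OBSTACLES = {(1, 1), (2, 1)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Deterministic move; hitting a wall or an obstacle leaves the agent in place."""
    r, c = state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1]
    if not (0 <= r < SIZE and 0 <= c < SIZE) or (r, c) in OBSTACLES:
        r, c = state
    return (r, c), -1.0  # constant step cost (an assumed reward design)


def greedy(q, state):
    """Index of the highest-valued action; the first one wins ties."""
    return max(range(len(ACTIONS)), key=lambda a: q.get((state, a), 0.0))


def update(q, s, a, reward, s2, alpha, gamma):
    """One Q-learning backup; the goal is terminal, so its value is 0."""
    best_next = 0.0 if s2 == GOAL else max(
        q.get((s2, b), 0.0) for b in range(len(ACTIONS)))
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (reward + gamma * best_next - old)


def dyna_q(episodes=300, planning_steps=20, alpha=1.0, gamma=0.95,
           epsilon=0.1, seed=0):
    """Classical Dyna-Q: direct RL, model learning, then simulated planning."""
    rng = random.Random(seed)
    q, model = {}, {}
    for _ in range(episodes):
        s = START
        for _ in range(500):  # step cap per episode
            if s == GOAL:
                break
            # Epsilon-greedy action selection on the real environment.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy(q, s)
            s2, reward = step(s, a)
            update(q, s, a, reward, s2, alpha, gamma)  # direct RL update
            model[(s, a)] = (s2, reward)               # remember the transition
            seen = list(model)
            for _ in range(planning_steps):            # planning from the model
                ps, pa = rng.choice(seen)
                ps2, pr = model[(ps, pa)]
                update(q, ps, pa, pr, ps2, alpha, gamma)
            s = s2
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START until GOAL or the step limit."""
    path = [START]
    while path[-1] != GOAL and len(path) <= max_steps:
        nxt, _ = step(path[-1], greedy(q, path[-1]))
        path.append(nxt)
    return path


if __name__ == "__main__":
    print(greedy_path(dyna_q()))
```

A learning rate of 1.0 is used here only because this toy environment is deterministic; a stochastic or changing environment would need a smaller value. The planning loop samples uniformly from every transition seen so far, which is the part a guidance mechanism such as the one described in the abstract would presumably narrow.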

Related resources

Wp-dyna: Planning and Reinforcement Learning in Well-plannable Environments

Reinforcement learning (RL) involves sequential decision making in uncertain environments. The aim of the decision-making agent is to maximize the benefit of acting in its environment over an extended period of time. Finding an optimal policy in RL may be very slow. To speed up learning, one often used solution is the integration of planning, for example, Sutton’s Dyna algorithm, or various oth...

Dyna-H: a heuristic planning reinforcement learning algorithm applied to role-playing-game strategy decision systems

In a Role-Playing Game, finding optimal trajectories is one of the most important tasks. In fact, the strategy decision system becomes a key component of a game engine. Determining the way in which decisions are taken (online, batch or simulated) and the resources consumed in decision making (e.g. execution time, memory) will influence, to a major degree, the game performance. When classical sear...

The Z Method for Fast Path Planning in Dynamic Environments

We present a method to plan collision-free paths for robots with any number of degrees of freedom in dynamic environments. The method proved to be very efficient as it omits a complete representation of the high-dimensional search space. Its complexity is linear in the number of degrees of freedom. A preprocessing of the geometry data of the robot or the environment is not required. With the time ...

Path Planning in Dynamic Environments

The motion planning problem for mobile robots is typically formulated as follows: given a robot and a description of an environment, plan a path of the robot between two specified locations, which is collision-free and satisfies certain optimization criteria. Traditionally there are two approaches to the problem: Off-line planning, which assumes perfectly known and stable environment, and on-li...

Journal

Journal title: Drones

Year: 2022

ISSN: 2504-446X

DOI: https://doi.org/10.3390/drones6110365